114 research outputs found
Discriminant Projection Representation-based Classification for Vision Recognition
Representation-based classification methods such as sparse
representation-based classification (SRC) and linear regression classification
(LRC) have attracted a lot of attentions. In order to obtain the better
representation, a novel method called projection representation-based
classification (PRC) is proposed for image recognition in this paper. PRC is
based on a new mathematical model. This model denotes that the 'ideal
projection' of a sample point on the hyper-space may be gained by
iteratively computing the projection of on a line of hyper-space with
the proper strategy. Therefore, PRC is able to iteratively approximate the
'ideal representation' of each subject for classification. Moreover, the
discriminant PRC (DPRC) is further proposed, which obtains the discriminant
information by maximizing the ratio of the between-class reconstruction error
over the within-class reconstruction error. Experimental results on five
typical databases show that the proposed PRC and DPRC are effective and
outperform other state-of-the-art methods on several vision recognition tasks.Comment: Accepted by the Thirty-Second AAAI Conference on Artificial
Intelligence (AAAI-18
NavGPT: Explicit Reasoning in Vision-and-Language Navigation with Large Language Models
Trained with an unprecedented scale of data, large language models (LLMs)
like ChatGPT and GPT-4 exhibit the emergence of significant reasoning abilities
from model scaling. Such a trend underscored the potential of training LLMs
with unlimited language data, advancing the development of a universal embodied
agent. In this work, we introduce the NavGPT, a purely LLM-based
instruction-following navigation agent, to reveal the reasoning capability of
GPT models in complex embodied scenes by performing zero-shot sequential action
prediction for vision-and-language navigation (VLN). At each step, NavGPT takes
the textual descriptions of visual observations, navigation history, and future
explorable directions as inputs to reason the agent's current status, and makes
the decision to approach the target. Through comprehensive experiments, we
demonstrate NavGPT can explicitly perform high-level planning for navigation,
including decomposing instruction into sub-goal, integrating commonsense
knowledge relevant to navigation task resolution, identifying landmarks from
observed scenes, tracking navigation progress, and adapting to exceptions with
plan adjustment. Furthermore, we show that LLMs is capable of generating
high-quality navigational instructions from observations and actions along a
path, as well as drawing accurate top-down metric trajectory given the agent's
navigation history. Despite the performance of using NavGPT to zero-shot R2R
tasks still falling short of trained models, we suggest adapting multi-modality
inputs for LLMs to use as visual navigation agents and applying the explicit
reasoning of LLMs to benefit learning-based models
Genetic learning particle swarm optimization
Social learning in particle swarm optimization (PSO) helps collective efficiency, whereas individual reproduction in genetic algorithm (GA) facilitates global effectiveness. This observation recently leads to hybridizing PSO with GA for performance enhancement. However, existing work uses a mechanistic parallel superposition and research has shown that construction of superior exemplars in PSO is more effective. Hence, this paper first develops a new framework so as to organically hybridize PSO with another optimization technique for “learning.” This leads to a generalized “learning PSO” paradigm, the *L-PSO. The paradigm is composed of two cascading layers, the first for exemplar generation and the second for particle updates as per a normal PSO algorithm. Using genetic evolution to breed promising exemplars for PSO, a specific novel *L-PSO algorithm is proposed in the paper, termed genetic learning PSO (GL-PSO). In particular, genetic operators are used to generate exemplars from which particles learn and, in turn, historical search information of particles provides guidance to the evolution of the exemplars. By performing crossover, mutation, and selection on the historical information of particles, the constructed exemplars are not only well diversified, but also high qualified. Under such guidance, the global search ability and search efficiency of PSO are both enhanced. The proposed GL-PSO is tested on 42 benchmark functions widely adopted in the literature. Experimental results verify the effectiveness, efficiency, robustness, and scalability of the GL-PSO
- …